Add @helix-db/migrate tool for Supabase to HelixDB migration#862

Open
xav-db wants to merge 5 commits into dev from claude/supabase-helix-migration-zqJvP

Conversation


xav-db (Member) commented Feb 13, 2026

Description

This PR introduces @helix-db/migrate, a comprehensive white-glove migration tool for moving projects from Supabase (PostgreSQL) to HelixDB. The tool automates the entire migration pipeline in five phases:

  1. Introspection: Connects to a Supabase database and analyzes the schema (tables, columns, types, foreign keys, indexes, enums)
  2. Schema Generation: Converts PostgreSQL schema to HelixDB schema format (.hx files):
    • Tables → Node types
    • Foreign keys → Edge types
    • pgvector columns → Vector types with metadata
  3. Project Generation: Creates a complete HelixDB project structure with configuration and CRUD queries
  4. Data Export: Exports all data from Supabase tables as JSON files with proper type serialization
  5. Data Import: Imports exported data into a running HelixDB instance, maintaining referential integrity through ID mapping
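
As a rough illustration of the phase-2 type conversion, here is a hedged sketch of a PostgreSQL-to-HelixDB type map. The HelixDB type names, the map contents, and the fallback-to-String rule are assumptions for illustration, not the actual contents of type-map.ts:

```typescript
// Hypothetical PostgreSQL → HelixDB type map; names are assumptions.
const PG_TO_HELIX: Record<string, string> = {
  integer: "I32",
  bigint: "I64",
  real: "F32",
  "double precision": "F64",
  text: "String",
  boolean: "Boolean",
  "timestamp with time zone": "Date",
  jsonb: "String", // complex JSON is serialized to a string
};

function mapType(pgType: string): string {
  // Array types such as "integer[]" map to an array of the element type.
  if (pgType.endsWith("[]")) {
    return `[${mapType(pgType.slice(0, -2))}]`;
  }
  return PG_TO_HELIX[pgType] ?? "String";
}
```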

Key Features

  • Interactive CLI with prompts for connection strings and migration options
  • Type Mapping: Comprehensive PostgreSQL to HelixDB type conversion (integers, floats, strings, dates, JSON, arrays, vectors)
  • Relationship Handling: Automatically converts foreign keys to edges with proper cardinality detection
  • Vector Support: Detects pgvector columns and generates appropriate Vector types
  • Batch Processing: Configurable batch sizes and concurrency for efficient data export/import
  • Error Handling: Detailed error reporting with row-level diagnostics
  • ID Mapping: Maintains mapping between old PostgreSQL PKs and new HelixDB IDs for edge creation
  • Flexible Modes: Supports --introspect-only, --import-only, and full migration workflows
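
The ID-mapping feature can be pictured with a small sketch. The types and helper names below are illustrative, not the tool's actual API:

```typescript
// table name → (old PostgreSQL PK → new HelixDB ID); names are illustrative.
type IdMap = Map<string, Map<string | number, string>>;

function recordId(map: IdMap, table: string, oldPk: string | number, newId: string): void {
  if (!map.has(table)) map.set(table, new Map());
  map.get(table)!.set(oldPk, newId);
}

// Resolve both endpoints of an FK-derived edge; returns null if either side
// was never imported, so the edge can be skipped with a row-level diagnostic.
function resolveEdge(
  map: IdMap,
  from: { table: string; pk: string | number },
  to: { table: string; pk: string | number }
): { fromId: string; toId: string } | null {
  const fromId = map.get(from.table)?.get(from.pk);
  const toId = map.get(to.table)?.get(to.pk);
  return fromId && toId ? { fromId, toId } : null;
}
```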

Implementation Details

  • introspect.ts: PostgreSQL schema introspection using pg client and information_schema queries
  • generate-schema.ts: Converts introspected schema to HelixDB Node/Edge/Vector definitions with proper field mapping
  • generate-queries.ts: Generates CRUD and import queries for all generated types
  • export-data.ts: Exports table data with cursor-based pagination for large tables
  • import-data.ts: Imports data via HelixDB HTTP API with topological sorting for FK dependencies
  • type-map.ts: Comprehensive PostgreSQL to HelixDB type mapping with serialization rules
  • index.ts: Main CLI orchestration with progress tracking and user guidance
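
The FK-dependency ordering that import-data.ts performs can be sketched with Kahn's algorithm. The function shape and cycle-handling policy here are assumptions for illustration, not the file's actual code:

```typescript
// Sort tables so each table is imported after the tables it references.
// deps[t] lists the tables t references via foreign keys.
function topoSort(tables: string[], deps: Record<string, string[]>): string[] {
  const inDeg = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const t of tables) {
    // Ignore self-references and references to tables outside the set.
    const ds = (deps[t] ?? []).filter((d) => d !== t && tables.includes(d));
    inDeg.set(t, ds.length);
    for (const d of ds) {
      if (!dependents.has(d)) dependents.set(d, []);
      dependents.get(d)!.push(t);
    }
  }
  const queue = tables.filter((t) => inDeg.get(t) === 0);
  const order: string[] = [];
  while (queue.length) {
    const t = queue.shift()!;
    order.push(t);
    for (const dep of dependents.get(t) ?? []) {
      inDeg.set(dep, inDeg.get(dep)! - 1);
      if (inDeg.get(dep) === 0) queue.push(dep);
    }
  }
  // Any remaining tables are part of a cycle; append them so import proceeds.
  for (const t of tables) if (!order.includes(t)) order.push(t);
  return order;
}
```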

Related Issues

Closes #

Checklist when merging to main

  • No compiler warnings
  • Code is formatted
  • Code is easy to understand
  • Doc comments provided for all modules and key functions
  • TypeScript strict mode enabled

Additional Notes

The migration tool is designed to be user-friendly with:

  • Clear progress indicators using ora spinners
  • Helpful error messages with troubleshooting tips
  • Colored output for better readability
  • Detailed migration guide generation
  • Support for partial migrations (schema-only or data-only modes)

The tool handles edge cases like:

  • Supabase internal tables (automatically filtered)
  • Composite primary keys
  • Circular foreign key dependencies (topological sorting)
  • Large tables (cursor-based pagination)
  • Complex PostgreSQL types (JSON, arrays, custom types)
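
Keyset (cursor-based) pagination over large tables can be shown in miniature. Here an in-memory array stands in for the `WHERE pk > cursor ORDER BY pk LIMIT n` query a real exporter would issue; all names are illustrative:

```typescript
// Each page is "rows with id > cursor, ordered by id, up to `batch` rows";
// the cursor then advances to the last id seen. Purely illustrative.
function* paginate<T extends { id: number }>(rows: T[], batch: number): Generator<T[]> {
  const sorted = [...rows].sort((a, b) => a.id - b.id);
  let cursor = Number.NEGATIVE_INFINITY;
  while (true) {
    const page = sorted.filter((r) => r.id > cursor).slice(0, batch);
    if (page.length === 0) return;
    yield page;
    cursor = page[page.length - 1].id; // advance the keyset cursor
  }
}
```

Unlike OFFSET-based paging, the cursor keeps each page query cheap even deep into a large table, since the database can seek directly to the key.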

https://claude.ai/code/session_019kCQvvethhN4B7APgabxFy

Greptile Overview

Greptile Summary

This PR introduces @helix-db/migrate, a comprehensive CLI tool for migrating Supabase (PostgreSQL) projects to HelixDB. The tool automates schema introspection, conversion, data export, and import in five phases.

Key changes:

  • Introspects PostgreSQL schema (tables, columns, types, FKs, indexes, enums) and converts to HelixDB format
  • Generates Node types from tables, Edge types from foreign keys, and Vector types from pgvector columns
  • Exports data with cursor-based pagination and imports with topological sorting for FK dependencies
  • Creates complete HelixDB project structure with CRUD queries and migration guide

Critical security issues found:

  • SQL injection vulnerabilities in introspect.ts (lines 77, 279) where schema names are concatenated directly into queries without parameterization
  • SQL injection vulnerabilities in export-data.ts (lines 82, 105, 267) where schema/table names are interpolated without proper escaping
  • While these values come from database introspection, they should still use parameterized queries or proper identifier escaping to prevent potential attacks
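
The parameterized alternative the reviewers describe can be sketched as a small helper (names are illustrative; this is not the PR's code):

```typescript
// Build "$1,$2,..." placeholders so the pg driver binds the schema names
// itself instead of having them concatenated into the SQL text.
function schemaFilter(schemas: string[]): { placeholders: string; values: string[] } {
  return {
    placeholders: schemas.map((_, i) => `$${i + 1}`).join(","),
    values: schemas,
  };
}

// Usage with node-postgres would look roughly like:
//   const { placeholders, values } = schemaFilter(["public", "auth"]);
//   await client.query(
//     `SELECT table_name FROM information_schema.tables
//       WHERE table_schema IN (${placeholders})`,
//     values
//   );
```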

Other observations:

  • TypeScript strict mode is enabled as required
  • Code is well-documented with clear comments
  • Error handling is comprehensive with helpful user messages
  • The tool properly handles edge cases like composite PKs, circular dependencies, and large tables

Important Files Changed

  • tools/migrate/src/index.ts: Main CLI orchestration with interactive prompts; handles all five migration phases with proper error handling
  • tools/migrate/src/introspect.ts: PostgreSQL schema introspection, with potential SQL injection in schema list construction
  • tools/migrate/src/generate-schema.ts: Converts PG tables to HelixDB nodes/edges/vectors with proper field mapping
  • tools/migrate/src/export-data.ts: Data export with cursor-based pagination; has a SQL injection vulnerability in table/schema identifiers
  • tools/migrate/src/import-data.ts: Data import with topological sorting for FK dependencies and proper ID mapping for edge creation

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as CLI (index.ts)
    participant Intro as Introspect (introspect.ts)
    participant GenSchema as Generate Schema (generate-schema.ts)
    participant GenQueries as Generate Queries (generate-queries.ts)
    participant Export as Export Data (export-data.ts)
    participant Import as Import Data (import-data.ts)
    participant PG as PostgreSQL/Supabase
    participant Helix as HelixDB Instance

    User->>CLI: Run helix-migrate supabase
    CLI->>User: Prompt for connection string
    User->>CLI: Provide connection string

    Note over CLI,Intro: Phase 1: Introspection
    CLI->>Intro: introspectDatabase(connectionString, schemas)
    Intro->>PG: Query information_schema (tables, columns, PKs, FKs, indexes)
    PG-->>Intro: Schema metadata
    Intro->>PG: Query pg_catalog (enums, row counts)
    PG-->>Intro: Additional metadata
    Intro-->>CLI: SchemaIntrospection

    Note over CLI,GenQueries: Phase 2: Schema Generation
    CLI->>GenSchema: generateSchema(introspection)
    GenSchema->>GenSchema: Convert tables to Nodes
    GenSchema->>GenSchema: Convert FKs to Edges
    GenSchema->>GenSchema: Extract vector columns to Vectors
    GenSchema-->>CLI: GeneratedSchema (.hx files)
    CLI->>GenQueries: generateQueries(schema)
    GenQueries-->>CLI: CRUD and Import queries

    Note over CLI: Phase 3: Write Project Files
    CLI->>CLI: Write helix.toml, schema.hx, queries.hx, import.hx, MIGRATION_GUIDE.md

    CLI->>User: Prompt to export data
    User->>CLI: Confirm export

    Note over CLI,Export: Phase 4: Data Export
    CLI->>Export: exportData(connectionString, tables, outputDir)
    loop For each table
        Export->>PG: SELECT with pagination (OFFSET/LIMIT)
        PG-->>Export: Batch of rows
        Export->>Export: Transform and serialize (JSON, arrays, vectors)
        Export->>Export: Write to JSON file
    end
    Export-->>CLI: Export results

    CLI->>User: Prompt to import data into HelixDB
    User->>CLI: Confirm import

    Note over CLI,Import: Phase 5: Data Import
    CLI->>Import: importData(helixUrl, exportDir, schema, tables)
    Import->>Import: Topological sort nodes by FK dependencies
    loop For each node (sorted)
        Import->>Import: Read exported JSON file
        loop For each row (batched)
            Import->>Helix: POST /{ImportNodeQuery} with data
            Helix-->>Import: New HelixDB ID
            Import->>Import: Map old PK to new ID
        end
    end
    loop For each edge
        Import->>Import: Read source table JSON
        loop For each row with FK
            Import->>Import: Lookup from_id and to_id via ID mapping
            Import->>Helix: POST /{ImportEdgeQuery} with from_id, to_id
            Helix-->>Import: Edge created
        end
    end
    loop For each vector
        Import->>Import: Read exported JSON file
        loop For each row with vector
            Import->>Helix: POST /{ImportVectorQuery} with vector data
            Helix-->>Import: Vector created
        end
    end
    Import-->>CLI: Import results (counts, errors, ID mapping)

    CLI->>User: Migration complete! Display next steps

Last reviewed commit: e00872f

Adds @helix-db/migrate, a TypeScript CLI tool that automates migrating
Supabase projects to HelixDB. The tool introspects a Supabase Postgres
schema, auto-generates HelixDB .hx schema/query files, exports data,
and imports it into a running HelixDB instance via the HTTP API.

https://claude.ai/code/session_019kCQvvethhN4B7APgabxFy

@greptile-apps greptile-apps Bot left a comment


11 files reviewed, 5 comments


Comment thread: tools/migrate/src/introspect.ts (Outdated)
schemas: string[]
): Promise<TableInfo[]> {
// Get all tables in the specified schemas
const schemaList = schemas.map((s) => `'${s}'`).join(",");

SQL injection vulnerability in schema list construction. The schemas array values are concatenated directly without parameterization.

Suggested change
const schemaList = schemas.map((s) => `'${s}'`).join(",");
const schemaList = schemas.map((_, i) => `$${i + 1}`).join(",");

Then pass schemas as the second parameter to client.query() on line 79.

Comment thread: tools/migrate/src/introspect.ts (Outdated)
client: Client,
schemas: string[]
): Promise<Record<string, string[]>> {
const schemaList = schemas.map((s) => `'${s}'`).join(",");

Same SQL injection vulnerability in enum query.

Suggested change
const schemaList = schemas.map((s) => `'${s}'`).join(",");
const schemaList = schemas.map((_, i) => `$${i + 1}`).join(",");

Then pass schemas as parameter array to the query.

// For small tables, just SELECT all
if (table.rowCount <= batchSize) {
const result = await client.query(
`SELECT ${columnNames} FROM "${schema}"."${tableName}"`

SQL injection vulnerability with schema and table name interpolation.


while (true) {
const result = await client.query(
`SELECT ${columnNames} FROM "${schema}"."${tableName}" ORDER BY ${orderBy} LIMIT $1 OFFSET $2`,

SQL injection vulnerability with schema, table, and order by column interpolation.


greptile-apps Bot commented Feb 13, 2026

Additional Comments (1)

tools/migrate/src/export-data.ts
SQL injection vulnerability. The schema and tableName values from TableInfo are interpolated directly into the query without parameterization. While these come from introspection results, they should still use proper SQL identifiers or be validated to ensure they don't contain malicious input.

Consider using pg-format library's format.ident() or validate that schema/table names only contain alphanumeric and underscore characters.
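
A minimal sketch of the validation/escaping route this comment suggests (not pg-format itself; just standard SQL double-quote identifier rules, with names illustrative):

```typescript
// Quote an identifier per standard SQL rules: wrap in double quotes and
// double any embedded quote characters. Illustrative, not the PR's code.
function safeIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Alternatively, reject anything that is not a plain identifier outright:
function assertPlainIdent(name: string): string {
  if (!/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(name)) {
    throw new Error(`Refusing to use suspicious identifier: ${name}`);
  }
  return name;
}
```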

  • Added a README.md for the migration tool with usage instructions and setup guides.
  • Updated package.json to include additional files for packaging and added a prepack script so the build runs before packaging.
  • Modified importData to accept an optional helixApiKey parameter, enabling API-key authentication during data import.
  • Updated various functions to use helixApiKey when calling the HelixDB API.
  • Enhanced command-line options in index.ts to support helix-api-key and instance management features.